A L2 Discrepancy Learning Process with Applications to Outlier and Insider Detections With Large High-Dimensional Data

Author

  • Faysal El Khettabi
Abstract

In this paper, a discrepancy-based framework is presented for outlier and insider detection. Given any sequence of profiles, a local discrepancy first identifies regions where the profiles are clumped or scarce; a global L2 discrepancy then summarizes the overall distribution pattern of the data in a single real value. An L2 discrepancy learning process is formulated to rank each profile in the sequence on the basis of optimizing the L2 discrepancy value. This learning process gives access to several levels of information about the outliers and insiders in the data. Experimental results on data sets with different features demonstrate that the algorithm efficiently detects the outliers and insiders in the data.
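The abstract does not spell out the discrepancy formula or the ranking rule, so the following is only a minimal sketch under stated assumptions. It assumes the profiles are already scaled into the unit cube [0, 1]^d, uses Warnock's closed form for the squared L2 star discrepancy as the global summary value, and ranks profiles by a leave-one-out change in that value. The function names and the leave-one-out criterion are hypothetical illustrations, not the paper's procedure.

```python
import numpy as np


def l2_star_discrepancy_sq(points):
    """Squared L2 star discrepancy of points in [0, 1]^d (Warnock's closed form)."""
    x = np.asarray(points, dtype=float)
    n, d = x.shape
    # Single sum: product over coordinates of (1 - x_ik^2), summed over points.
    single = np.prod(1.0 - x**2, axis=1).sum()
    # Double sum: product over coordinates of (1 - max(x_ik, x_jk)), over all pairs.
    pair_max = np.maximum(x[:, None, :], x[None, :, :])
    double = np.prod(1.0 - pair_max, axis=2).sum()
    return 3.0**-d - 2.0**(1 - d) / n * single + double / n**2


def rank_by_leave_one_out(points):
    """Rank profiles by how much removing each one lowers the discrepancy.

    Assumption for illustration only: the abstract does not state the
    actual ranking criterion used by the learning process.
    """
    x = np.asarray(points, dtype=float)
    base = l2_star_discrepancy_sq(x)
    scores = np.array([
        base - l2_star_discrepancy_sq(np.delete(x, i, axis=0))
        for i in range(len(x))
    ])
    order = np.argsort(-scores)  # largest drop in discrepancy first
    return order, scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    profiles = rng.random((50, 3))  # synthetic profiles, already in [0, 1]^3
    order, scores = rank_by_leave_one_out(profiles)
    print("global squared L2 discrepancy:", l2_star_discrepancy_sq(profiles))
    print("five highest-ranked profile indices:", order[:5])
```

The pairwise term costs O(N^2 d) time and memory, and the leave-one-out loop repeats it N times, so this sketch only suits small samples; a large high-dimensional data set would need an incremental update of the sums instead.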

Similar resources

Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data

Background and purpose: By evolving science, knowledge, and technology, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...

Outlier detection for high dimensional data pdf

Is particularly useful for high dimensional data where outliers cannot be found. High dimensional data in Euclidean space pose special challenges to data. In about just the last few years, the task of unsupervised outlier detection has found. Outlier detection is an outstanding data mining task referred to...

Outlier Detection in Random Subspaces over Data Streams: An Approach for Insider Threat Detection

Insider threat detection is an emergent concern for industries and governments due to the growing number of attacks in recent years. Several Machine Learning (ML) approaches have been developed to detect insider threats, however, they still suffer from a high number of false alarms. None of those approaches addressed the insider threat problem from the perspective of stream mining data where a ...

Image alignment via kernelized feature learning

Machine learning is an application of artificial intelligence that is able to automatically learn and improve from experience without being explicitly programmed. The primary assumption for most of the machine learning algorithms is that the training set (source domain) and the test set (target domain) follow from the same probability distribution. However, in most of the real-world application...

On the Interplay of Self-Esteem, Proficiency Level, and Language Learning Strategies Among Iranian L2 Learners

It is axiomatic that L2 teaching and learning is a process that requires dynamic involvement of L2 learners in the acquisition of knowledge and skills. L2 learners need to be assisted in setting individual learning goals. They should also be given the exposure to and guidance in effective language learning strategies (LLSs) in order to build a high level of confidence in the learning process. T...

Publication date: 2006